19 research outputs found

    Gradient Descent Ascent for Min-Max Problems on Riemannian Manifolds

    In this paper, we study a class of useful non-convex minimax optimization problems on Riemannian manifolds and propose a class of Riemannian gradient descent ascent algorithms to solve them. Specifically, we propose a new Riemannian gradient descent ascent (RGDA) algorithm for deterministic minimax optimization. Moreover, we prove that RGDA has a sample complexity of $O(\kappa^2\epsilon^{-2})$ for finding an $\epsilon$-stationary point of nonconvex strongly-concave minimax problems, where $\kappa$ denotes the condition number. At the same time, we introduce a Riemannian stochastic gradient descent ascent (RSGDA) algorithm for stochastic minimax optimization. In the theoretical analysis, we prove that RSGDA achieves a sample complexity of $O(\kappa^3\epsilon^{-4})$. To further reduce the sample complexity, we propose a novel momentum variance-reduced Riemannian stochastic gradient descent ascent (MVR-RSGDA) algorithm based on the momentum-based variance-reduction technique of STORM. We prove that the MVR-RSGDA algorithm achieves a lower sample complexity of $\tilde{O}(\kappa^{(3-\nu/2)}\epsilon^{-3})$ for $\nu \geq 0$, which reaches the best known sample complexity for its Euclidean counterpart. Extensive experimental results on robust training of deep neural networks over the Stiefel manifold demonstrate the efficiency of our proposed algorithms. Comment: 32 pages. We have updated the theoretical results of our methods in this new revision. E.g., our MVR-RSGDA algorithm achieves a lower sample complexity. arXiv admin note: text overlap with arXiv:2008.0817
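The abstract above does not spell out the RGDA update rule, so the following is only a minimal sketch of the general idea of Riemannian gradient descent ascent, under assumptions that are not from the paper: the manifold is the unit sphere rather than the Stiefel manifold, the objective is the toy function f(x, y) = yᵀBx − ½‖y‖² (strongly concave in y), and the function name `rgda_sphere`, the step sizes `eta` and `lam`, and the matrix `B` are all hypothetical choices made for illustration.

```python
import numpy as np

def rgda_sphere(B, x0, y0, eta=0.02, lam=0.5, iters=3000):
    """Toy gradient descent ascent for
        min over the unit sphere (x), max over R^m (y) of
        f(x, y) = y^T B x - 0.5 * ||y||^2.
    Illustrative sketch only; not the paper's algorithm."""
    x = x0 / np.linalg.norm(x0)      # start on the sphere
    y = y0.astype(float).copy()
    for _ in range(iters):
        gx = B.T @ y                 # Euclidean gradient of f in x
        gy = B @ x - y               # gradient of f in y
        # Riemannian gradient: project gx onto the tangent space at x
        rgrad = gx - (x @ gx) * x
        # Descent step in x, then retract back onto the sphere
        x_new = x - eta * rgrad
        x = x_new / np.linalg.norm(x_new)
        # Ascent step in y (plain Euclidean update)
        y = y + lam * gy
    return x, y

# Hypothetical test problem: a diagonal B, so the answer is known in closed form.
B = np.diag([3.0, 2.0, 1.0])
x, y = rgda_sphere(B, np.ones(3), np.zeros(3))
phi = 0.5 * np.linalg.norm(B @ x) ** 2   # value of max_y f(x, y) at the final x
```

For this toy objective the inner maximizer is y = Bx, so the outer problem reduces to minimizing ½‖Bx‖² on the sphere, whose minimum is half the smallest squared singular value of B (0.5 for the diagonal matrix above). The ascent step is deliberately larger than the descent step so that y tracks its maximizer while x moves slowly, a common design choice for descent ascent methods on nonconvex strongly-concave problems.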

    EffConv: Efficient Learning of Kernel Sizes for Convolution Layers of CNNs

    Determining kernel sizes of a CNN model is a crucial and non-trivial design choice and significantly impacts its performance. The majority of kernel size design methods rely on complex heuristic tricks or leverage neural architecture search that requires extreme computational resources. Thus, learning kernel sizes, using methods such as modeling kernels as a combination of basis functions, jointly with the model weights has been proposed as a workaround. However, previous methods cannot achieve satisfactory results or are inefficient for large-scale datasets. To fill this gap, we design a novel efficient kernel size learning method in which a size predictor model learns to predict optimal kernel sizes for a classifier given a desired number of parameters. It does so in collaboration with a kernel predictor model that predicts the weights of the kernels - given kernel sizes predicted by the size predictor - to minimize the training objective, and both models are trained end-to-end. Our method only needs a small fraction of the training epochs of the original CNN to train these two models and find proper kernel sizes for it. Thus, it offers an efficient and effective solution for the kernel size learning problem. Our extensive experiments on MNIST, CIFAR-10, STL-10, and ImageNet-32 demonstrate that our method can achieve the best training time vs. accuracy trade-off compared to previous kernel size learning methods and significantly outperform them on challenging datasets such as STL-10 and ImageNet-32. Our implementations are available at https://github.com/Alii-Ganjj/EffConv

    Video Recovery via Learning Variation and Consistency of Images

    Matrix completion algorithms have been widely used to recover images with missing entries, and they have proved to be very effective. Recent works have applied tensor completion models to video recovery, assuming that all video frames are homogeneous and correlated. However, real videos are made up of different episodes or scenes, i.e., they are heterogeneous. Therefore, a video recovery model that utilizes both the spatiotemporal consistency and the variation of a video is necessary. To solve this problem, we propose a new video recovery method, Sectional Trace Norm with Variation and Consistency Constraints (STN-VCC). In our model, capped L1-norm regularization is utilized to learn the spatial-temporal consistency and variation between consecutive frames in video clips. Meanwhile, we introduce a new low-rank model to capture the low-rank structure in video frames with a better approximation of rank minimization than the traditional trace norm. An efficient optimization algorithm is proposed, and we also provide a proof of convergence in the paper. We evaluate the proposed method on several video recovery tasks, and experimental results show that our new method consistently outperforms other related approaches.

    Discriminative Multi-instance Multitask Learning for 3D Action Recognition


    A Tmprss2-CreERT2 Knock-In Mouse Model for Cancer Genetic Studies on Prostate and Colon.

    Fusion between TMPRSS2 and ERG, placing ERG under the control of the TMPRSS2 promoter, is the most frequent genetic alteration in prostate cancer, present in 40-50% of cases. The fusion event is an early, if not initiating, event in prostate cancer, implicating the TMPRSS2-positive prostate epithelial cell as the cancer cell of origin in fusion-positive prostate cancer. To introduce genetic alterations into Tmprss2-positive cells in mice in a temporal-specific manner, we generated a Tmprss2-CreERT2 knock-in mouse. We found robust tamoxifen-dependent Cre activation in the prostate luminal cells but not basal epithelial cells, as well as epithelial cells of the bladder and gastrointestinal (GI) tract. The knock-in allele on the Tmprss2 locus does not noticeably impact prostate, bladder, or gastrointestinal function. Deletion of Pten in Tmprss2-positive cells of adult mice generated neoplasia only in the prostate, while deletion of Apc in these cells generated neoplasia only in the GI tract. These results suggest that this new Tmprss2-CreERT2 mouse model will be a useful resource for genetic studies on prostate and colon